Fraud
Out-of-Distribution Generalized Graph Anomaly Detection with Homophily-aware Environment Mixup
Graph anomaly detection (GAD) is widely prevalent in scenarios such as financial fraud detection, anti-money laundering and social bot detecion. However, structural distribution shifts are commonly observed in real-world GAD data due to selection bias, resulting in reduced homophily. Existing GAD methods tend to rely on homophilic shortcuts when trained on high-homophily structures, limiting their ability to generalize well to data with low homophily under structural distribution shifts. In this study, we propose to handle structural distribution shifts by generating novel environments characterized by diverse homophilic structures and utilizing invariant patterns, i.e., features and structures with the capability of stable prediction across structural distribution shifts, which face two challenges: (1) How to discover invariant patterns from entangled features and structures, as structures are sensitive to varying homophilic distributions.
Safety Pretraining: Toward the Next Generation of Safe AI
As large language models (LLMs) are increasingly deployed in high-stakes settings, the risk of generating harmful or toxic content remains a central challenge. Post-hoc alignment methods are brittle: once unsafe patterns are learned during pretraining, they are hard to remove. In this work, we present a data-centric pretraining framework that builds safety into the model from the start. Our framework consists of four key steps: (i) Safety Filtering: building a safety classifier to classify webdata into safe and unsafe categories; (ii) Safety Rephrasing: we recontextualize unsafe webdata into safer narratives; (iii) Native Refusal: we synthetically generate pretraining datasets that actively teach models to refuse on unsafe content and the moral reasoning behind it, and (iv) Harmfulness-Tag annotated pretraining: we flag unsafe content during pretraining using a special token, and use it to steer models away from unsafe generations at inference-time. Our safety-pretrained models reduce attack success rates from 38.8% to 8.4% on standard LLM safety benchmarks with no performance degradation on general tasks.
Alignment of Large Language Models with Constrained Learning
We study the problem of computing an optimal large language model (LLM) policy for the constrained alignment problem, where the goal is to maximize a primary reward objective while satisfying constraints on secondary utilities. Despite the popularity of Lagrangian-based LLM policy search in constrained alignment, iterative primal-dual methods often fail to converge, and non-iterative dual-based methods do not achieve optimality in the LLM parameter space. To address these challenges, we employ Lagrangian duality to develop an iterative dual-based alignment method that alternates between updating the LLM policy via Lagrangian maximization and updating the dual variable via dual descent. In theory, we characterize the primal-dual gap between the primal value in the distribution space and the dual value in the LLM parameter space. We further quantify the optimality gap of the learned LLM policies at near-optimal dual variables with respect to both the objective and the constraint functions. These results prove that dual-based alignment methods can find an optimal constrained LLM policy, up to an LLM parametrization gap. We demonstrate the effectiveness and merits of our approach through extensive experiments conducted on the PKU-SafeRLHF and Anthropic HH-RLHF datasets.
https://papers.nips.cc/paper_files/paper/2025/file/09265e2568cf7a6ff47b506acbc2c6eb-Paper-Conference.pdf
Fraudulent activities have caused substantial negative social impacts and are exhibiting emerging characteristics such as intelligence and industrialization, posing challenges of high-order interactions, intricate dependencies, and the sparse yet concealed nature of fraudulent entities. Existing graph fraud detectors are limited by their narrow "receptive fields", as they focus only on the relations between an entity and its neighbors while neglecting longer-range structural associations hidden between entities. To address this issue, we propose a novel fraud detector based on Graph Path Aggregation (GPA). It operates through variable-length path sampling, semantic-associated path encoding, path interaction and aggregation, and aggregation-enhanced fraud detection. To further facilitate interpretable association analysis, we synthesize G-Internet, the first benchmark dataset in the field of internet fraud detection. Extensive experiments across datasets in multiple fraud scenarios demonstrate that the proposed GPA outperforms mainstream fraud detectors by up to +15% in Average Precision (AP). Additionally, GPA exhibits enhanced robustness to noisy labels and provides excellent interpretability by uncovering implicit fraudulent patterns across broader contexts.
Association-Focused Path Aggregation for Graph Fraud Detection
Fraudulent activities have caused substantial negative social impacts and are exhibiting emerging characteristics such as intelligence and industrialization, posing challenges of high-order interactions, intricate dependencies, and the sparse yet concealed nature of fraudulent entities. Existing graph fraud detectors are limited by their narrow receptive fields, as they focus only on the relations between an entity and its neighbors while neglecting longer-range structural associations hidden between entities. To address this issue, we propose a novel fraud detector based on Graph Path Aggregation (GPA). It operates through variable-length path sampling, semantic-associated path encoding, path interaction and aggregation, and aggregation-enhanced fraud detection. To further facilitate interpretable association analysis, we synthesize G-Internet, the first benchmark dataset in the field of internet fraud detection. Extensive experiments across datasets in multiple fraud scenarios demonstrate that the proposed GPA outperforms mainstream fraud detectors by up to +15% in Average Precision (AP). Additionally, GPA exhibits enhanced robustness to noisy labels and provides excellent interpretability by uncovering implicit fraudulent patterns across broader contexts.
Causal Label Recovery in Payment Networks
Fraud detection models in payment networks train on chargeback labels that are systematically biased. Every label must survive three sequential gates: authorization (declined transactions generate no labels), issuer reporting (unreported fraud is invisible), and delay (pending chargebacks are missing at training time). Labels that do arrive may be corrupted by first-party misuse or issuer misclassification. A companion paper [arXiv:2605.27557] proved that these four impairments impose a minimax lower bound on detection performance. This paper asks: can that bound be achieved? We formalize the observation pipeline as a sequential missing-data problem with three propensity stages and a corruption layer, and construct the Sequential Triply Robust (STR) estimator. The STR corrects for all four impairments simultaneously and achieves the semiparametric efficiency bound -- no estimator can have lower asymptotic variance. It is sequentially triply robust: at each gate, consistency requires only that either the propensity model or the outcome regression is correctly specified, not both. We provide corruption correction via noise-rate-adjusted pseudo-labels, empirical Bayes shrinkage to stabilize inverse-propensity weights for small issuers, a plug-in variance estimator yielding valid confidence intervals, and a Bernstein concentration inequality for finite-sample guarantees. On the operational side, we derive the optimal training delay -- the maturity window that minimizes the sum of label-quality loss and model staleness -- and prove that the STR permits training on data that is days old rather than months old, decoupling model freshness from the chargeback maturity cycle. The STR provably dominates naive chargeback-based training in mean squared error for any sample size.
The Fundamental Limits of Fraud Detection in Card Payment Networks
Card payment fraud detection is usually framed as a supervised classification problem. Although this approach has generated practical progress, improvement has remained incremental despite major advances in model architecture. We argue that this is not mainly a failure of function approximation or optimization, but a consequence of structural information impairments inherent to the payment ecosystem. We formalize card authorization as a sequential decision problem with delayed, censored, corrupted, and counterfactually missing feedback. We derive a minimax regret lower bound showing that these impairments enter multiplicatively in the denominator of the achievable learning rate. The bound implies that improving issuer reporting quality or reducing censorship can yield larger reductions in the regret floor than increasing model complexity. We also show that heterogeneity across issuers worsens learnability beyond what average impairment rates suggest. The paper contributes a theory of why fraud detection in payment networks is fundamentally harder than in standard online learning settings, identifies ecosystem information quality as the key bottleneck, and provides a theoretical basis for prioritizing investments in reporting infrastructure, dispute process quality, and selective exploration. The paper is theory-first and does not rely on proprietary transaction data.
EmDT: Embedding Diffusion Transformer for Tabular Data Generation in Fraud Detection
Imbalanced datasets pose a difficulty in fraud detection, as classifiers are often biased toward the majority class and perform poorly on rare fraudulent transactions. Synthetic data generation is therefore commonly used to mitigate this problem. In this work, we propose the Clustered Embedding Diffusion-Transformer (EmDT), a diffusion model designed to generate fraudulent samples. Our key innovation is to leverage UMAP clustering to identify distinct fraudulent patterns, and train a Transformer denoising network with sinusoidal positional embeddings to capture feature relationships throughout the diffusion process. Once the synthetic data has been generated, we employ a standard decision-tree-based classifier (e.g., XGBoost) for classification, as this type of model remains better suited to tabular datasets. Experiments on a credit card fraud detection dataset demonstrate that EmDT significantly improves downstream classification performance compared to existing oversampling and generative methods, while maintaining comparable privacy protection and preserving feature correlations present in the original data.